
Methods of Measuring Secular Trend



Methods of Measuring Trend: Overview and Comparison

Identifying and isolating the **secular trend** ($T_t$) is a crucial step in time series analysis. The trend represents the smooth, long-term movement of the series, abstracting from the shorter-term seasonal ($S_t$), cyclical ($C_t$), and irregular ($I_t$) variations. Various statistical and graphical methods have been developed to estimate this underlying trend component. The selection of a particular method depends on factors such as the visual pattern of the data, the assumed nature of the trend (e.g., linear or non-linear), the ease of computation, and the purpose of the analysis (e.g., descriptive analysis vs. forecasting).

The primary methods commonly used for measuring and estimating the trend component in a time series are:

  1. **Freehand Curve Method (Graphical Method):** A subjective method involving drawing a smooth curve visually through the plotted data.
  2. **Method of Semi-Averages:** A simple algebraic method that fits a straight line by averaging data in two halves of the series.
  3. **Moving Average Method:** A smoothing technique that replaces each data point with the average of itself and a fixed number of surrounding points.
  4. **Method of Least Squares:** A statistical method that fits a mathematical curve (like a straight line, parabola, etc.) to the data by minimizing the sum of squared deviations.

Each method has its own procedure, advantages, and limitations. Understanding these helps in choosing the most appropriate technique for a given time series dataset.

Comparison Overview

Here is a summary comparing the key characteristics of these four main methods for measuring trend:

**Freehand Curve**

Nature / Approach: Subjective, Visual, Graphical. Drawing a smooth curve through data points.

Key Advantages:

  • Extremely **simple and quick** to apply.
  • Does not require any calculation.
  • Highly **flexible**; can visually approximate any type of trend (linear or non-linear).

Key Disadvantages:

  • Highly **subjective**; results depend heavily on the analyst's judgment.
  • **Not mathematical**; does not provide an equation.
  • Cannot be reliably used for **forecasting**.
  • Lacks precision.

**Method of Semi-Averages**

Nature / Approach: Objective, Algebraic. Fits a straight line by averaging data in two halves.

Key Advantages:

  • Relatively **simple and easy** to calculate.
  • **Objective**; yields a unique trend line for a given dataset.
  • Provides a **linear equation** for the trend.

Key Disadvantages:

  • Strictly **assumes a linear trend**; not suitable for non-linear patterns.
  • Highly **affected by extreme values** in the first or second half.
  • Uses only two summary points (averages) to determine the line, ignoring variation within halves.

**Moving Average Method**

Nature / Approach: Objective, Smoothing Technique. Averages data points over a fixed window.

Key Advantages:

  • Conceptually **simple** and easy to understand.
  • Effective at **smoothing out seasonal and irregular** variations.
  • **Flexible**; can follow the general contour of complex or non-linear trends.
  • Does not assume a specific mathematical form for the trend.

Key Disadvantages:

  • **Loses data points** at the beginning and end of the series.
  • Does **not yield a mathematical equation** for the trend curve.
  • The choice of the averaging **period is crucial** and can be subjective; an inappropriate period can distort the trend.
  • May not perfectly remove cyclical components if their period overlaps with the moving average period.

**Method of Least Squares**

Nature / Approach: Objective, Statistical Curve Fitting. Fits a mathematical curve by minimizing squared errors.

Key Advantages:

  • Considered the **best fit** according to the statistical criterion of minimizing the sum of squared deviations.
  • Provides a **mathematical equation** ($T_t = f(t)$) for the trend.
  • Allows for **reliable forecasting** (extrapolation) using the equation.
  • Foundation for more advanced regression-based time series models.
  • **Objective**; yields a unique curve for a given equation form.

Key Disadvantages:

  • **More complex calculations** compared to other methods.
  • Requires assuming a **specific functional form** (linear, quadratic, exponential, etc.) for the trend, which might not always be appropriate.
  • Can be **sensitive to extreme values** (outliers), particularly in smaller datasets.
  • Requires data transformation for some non-linear forms (like exponential).

For most analytical and forecasting purposes, the Method of Least Squares is preferred due to its objectivity, mathematical basis, and ability to provide an explicit trend equation. However, moving averages are widely used in decomposition for their smoothing properties, while the other two methods are simpler but have significant limitations.


Freehand Curve Method (Graphical Method)

Concept

The Freehand Curve Method is the simplest and most elementary technique for estimating the trend component of a time series. It is a purely graphical method and relies on the analyst's visual judgment to draw a smooth curve or line that represents the underlying long-term movement of the data, while intentionally ignoring the short-term fluctuations (seasonal, cyclical, and irregular).

The underlying idea is that if you plot the data and visually average out the ups and downs, the resulting smooth line should approximate the general direction or path the series is following over the long run.

Procedure

The steps to apply the freehand curve method are straightforward:

  1. **Plot the Time Series Data:** Create a line graph of the time series with time ($t$) on the horizontal axis and the observed values ($Y_t$) on the vertical axis.
  2. **Visual Inspection:** Carefully examine the pattern of the plotted points. Look for the overall upward or downward slope and any apparent curvature.
  3. **Draw the Curve:** Using a pen or a drawing tool on the graph, draw a smooth curve (it could be a straight line if the trend appears linear) that passes through the middle of the plotted data points rather than through every point. The curve should be drawn such that:
    • It follows the general direction of the data.
    • It smooths out the peaks and troughs of the short-term fluctuations.
    • Ideally, approximately equal numbers of data points lie above and below the drawn curve throughout its length.
    • The sum of the signed vertical deviations of the original data points from the drawn trend curve is close to zero (positive deviations cancelling out negative ones).
  4. **Interpret the Curve:** The drawn curve represents the estimated trend line for the time series. Its slope and curvature indicate the nature of the long-term change.

*(Figure: a time series plot with visible fluctuations and a smooth freehand curve drawn through the center of the fluctuations, indicating the general upward trend.)*

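Advantages

  • Extremely simple and quick to apply; no calculation is required.
  • Highly flexible; it can visually approximate any type of trend (linear or non-linear).

Disadvantages

  • Highly subjective; different analysts may draw different curves through the same data.
  • Not mathematical; it does not provide an equation for the trend.
  • Cannot be reliably used for forecasting.
  • Lacks precision.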

In summary, while the freehand curve method provides a quick visual impression of the general trend, its inherent subjectivity makes it unsuitable for formal time series analysis, estimation, or forecasting. It is primarily a preliminary tool for initial exploration.


Summary for Competitive Exams - Methods of Measuring Trend (Overview & Freehand)

Measuring Trend ($T_t$): Estimating the long-term direction of a time series.

Main Methods:

  1. Freehand Curve (Graphical)
  2. Method of Semi-Averages (Simple Algebraic)
  3. Moving Average (Smoothing)
  4. Method of Least Squares (Statistical Curve Fitting)

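Freehand Curve Method: Plot the data and draw a smooth curve by eye through the middle of the fluctuations. It is simple, quick, and flexible, but subjective; it gives no equation and cannot be reliably used for forecasting.

This method is only for preliminary visual assessment, not for formal analysis.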



Method of Semi-Averages

Concept

The Method of Semi-Averages is a straightforward and objective technique for fitting a **straight line trend** to a time series. Unlike the freehand curve method, it provides a unique result for a given dataset. The underlying concept is to divide the time series into two equal (or approximately equal) halves and then find a single representative point for each half by calculating the arithmetic mean of the data values in that half. A straight line is then drawn through these two representative points.

This method simplifies the data by reducing each half of the series to a single average value plotted against the midpoint of the time periods it covers. The line connecting these two points is considered the estimated linear trend.

Procedure

The steps for applying the Method of Semi-Averages are as follows:

  1. **Divide the Data:** Split the entire time series data into two equal halves based on the number of time periods ($n$).
    • If the number of time periods ($n$) is **even**, divide the data exactly into two halves, with $n/2$ periods in each half.
    • If the number of time periods ($n$) is **odd**, the data cannot be divided exactly into two equal halves. In this case, the value for the **middle time period** is usually omitted, and the remaining $n-1$ data points are divided into two equal halves of $(n-1)/2$ periods each. The middle period is excluded to ensure that the two halves have the same number of observations.
  2. **Calculate Semi-Averages:** Compute the arithmetic mean (average) of the observed values ($Y_t$) for each of the two halves. Let these averages be denoted as $\bar{Y}_1$ (for the first half) and $\bar{Y}_2$ (for the second half).
  3. **Determine Time Points for Plotting:** Identify the time point that corresponds to the **center (middle)** of the periods included in each half.
    • For the first half, the center time point ($t_1$) is the mid-point of the time periods in that half.
    • For the second half, the center time point ($t_2$) is the mid-point of the time periods in that half. For example, if a half covers years 2018, 2019, 2020, the center is 2019. If a half covers 2021, 2022, 2023, 2024, the center is $(2022+2023)/2 = 2022.5$.
  4. **Plot the Semi-Average Points:** Plot the two points on the time series graph: $(t_1, \bar{Y}_1)$ and $(t_2, \bar{Y}_2)$.
  5. **Draw the Trend Line:** Draw a straight line passing directly through these two plotted points. This straight line represents the estimated linear trend using the method of semi-averages.
  6. **Find the Equation of the Trend Line (Optional but Recommended):** If needed, the equation of the straight line can be determined. A straight line has the form $T_t = a + bt$, where $T_t$ is the trend value at time $t$, $a$ is the intercept, and $b$ is the slope.
    • The slope ($b$) is the change in $Y$ divided by the change in time between the two points: $$b = \frac{\bar{Y}_2 - \bar{Y}_1}{t_2 - t_1}$$
    • The intercept ($a$) can be found by substituting one of the points $(t_1, \bar{Y}_1)$ or $(t_2, \bar{Y}_2)$ into the equation $T_t = a + bt$ and solving for $a$. For example, using $(t_1, \bar{Y}_1)$: $\bar{Y}_1 = a + b t_1 \implies a = \bar{Y}_1 - b t_1$. The time variable $t$ can be the actual year or a coded time variable (e.g., $t=1$ for the first year, $t=2$ for the second, etc.). Using coded time often simplifies calculations.

Example 1. Find the trend line using the method of semi-averages for the following data and write its equation:

| Year | Value ($Y_t$) |
|---|---|
| 2018 | 100 |
| 2019 | 110 |
| 2020 | 105 |
| 2021 | 120 |
| 2022 | 115 |
| 2023 | 130 |

Answer:

Given: Time series data for 6 years.

To Find: Trend line using Semi-Averages method and its equation.

Solution:

1. Divide Data: The number of years is $n=6$, which is even. Divide the data into two equal halves of $6/2 = 3$ years each.

  • First Half: Years 2018, 2019, 2020 (Values: 100, 110, 105)
  • Second Half: Years 2021, 2022, 2023 (Values: 120, 115, 130)

2. Calculate Semi-Averages:

  • Average of the first half: $\bar{Y}_1 = \frac{100 + 110 + 105}{3} = \frac{315}{3} = 105$.
  • Average of the second half: $\bar{Y}_2 = \frac{120 + 115 + 130}{3} = \frac{365}{3} \approx 121.67$.

3. Determine Time Points for Plotting:

  • Center of the first half (2018, 2019, 2020) is the middle year: $t_1 = 2019$.
  • Center of the second half (2021, 2022, 2023) is the middle year: $t_2 = 2022$.

We have two points for the trend line: (2019, 105) and (2022, 121.67).

4. Find the Equation of the Trend Line:

Let the trend equation be $T_t = a + bt$, where $t$ represents the year.

The slope ($b$) is the change in value divided by the change in time:

$$b = \frac{\bar{Y}_2 - \bar{Y}_1}{t_2 - t_1} = \frac{121.67 - 105}{2022 - 2019}$$

[Slope formula]

$$b = \frac{16.67}{3} \approx 5.556$$

Now, find the intercept ($a$) using one of the points, say $(t_1, \bar{Y}_1) = (2019, 105)$, and the equation $T_t = a + bt$.

$$105 = a + 5.556 \times 2019$$

(Substituting $t=2019, T_t=105, b=5.556$)

$$105 = a + 11217.564$$

$$a = 105 - 11217.564 = -11112.564$$

The equation of the trend line is: $$T_t = -11112.564 + 5.556 t$$ (where $t$ is the year). Because $t$ is large here, the intercept is very sensitive to the rounding of $b$.

Alternatively, we can use a coded time variable starting from 0 or 1 to simplify the intercept calculation. Let's use $t=1$ for 2018, $t=2$ for 2019, ..., $t=6$ for 2023.

The time points for plotting become the midpoints of the coded time periods:

  • First half (t=1, 2, 3): Center is $(1+3)/2 = 2$. So, point is (2, 105).
  • Second half (t=4, 5, 6): Center is $(4+6)/2 = 5$. So, point is (5, 121.67).

Slope ($b$):

$$b = \frac{121.67 - 105}{5 - 2} = \frac{16.67}{3} \approx 5.556$$

[Slope using coded time]

Find intercept ($a$) using point (2, 105):

$$105 = a + 5.556 \times 2$$

(Substituting coded $t=2, T_t=105, b=5.556$)

$$105 = a + 11.112$$

$$a = 105 - 11.112 = 93.888$$

The equation of the trend line is: $$T_t = 93.888 + 5.556 t$$ (where $t$ is the coded year, $t=1$ for 2018).

5. Plotting: Plot the original data and the line passing through (2019, 105) and (2022, 121.67) (or using coded time, through (2, 105) and (5, 121.67)).

*(Figure: the original 6 data points and a straight trend line drawn through the two semi-average points, (2019, 105) and (2022, 121.67).)*
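The procedure and the coded-time result of Example 1 can be checked with a few lines of Python. This is a minimal sketch, not a library routine: the function name `semi_average_trend` is made up for illustration, and it assumes equally spaced, chronologically ordered observations, so that the mean of the time values in a half is the midpoint of that half.

```python
from statistics import mean

def semi_average_trend(times, values):
    """Return (a, b) of T_t = a + b*t fitted by the method of semi-averages.

    Hypothetical helper for illustration; assumes equally spaced periods.
    """
    n = len(values)
    if n != len(times) or n < 2:
        raise ValueError("need at least two paired observations")
    half = n // 2
    # For odd n, the middle observation is skipped so both halves have equal size.
    first, second = slice(0, half), slice(n - half, n)
    t1, y1 = mean(times[first]), mean(values[first])     # first semi-average point
    t2, y2 = mean(times[second]), mean(values[second])   # second semi-average point
    b = (y2 - y1) / (t2 - t1)   # slope
    a = y1 - b * t1             # intercept
    return a, b

years = [2018, 2019, 2020, 2021, 2022, 2023]
values = [100, 110, 105, 120, 115, 130]
coded = [y - 2017 for y in years]   # t = 1 for 2018, as in Example 1

a, b = semi_average_trend(coded, values)
print(f"T_t = {a:.3f} + {b:.3f} t")
```

Without intermediate rounding the intercept comes out as 93.889 rather than 93.888; the small difference arises because the worked solution rounds $b$ to 5.556 before computing $a$.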


Example 2. Find the trend line using the method of semi-averages for the following data (Odd number of periods):

| Year | Value ($Y_t$) |
|---|---|
| 2017 | 50 |
| 2018 | 55 |
| 2019 | 65 |
| 2020 | 60 |
| 2021 | 70 |
| 2022 | 75 |
| 2023 | 80 |

Answer:

Given: Time series data for 7 years.

To Find: Trend line using Semi-Averages method.

Solution:

1. Divide Data: The number of years is $n=7$, which is odd. Omit the middle year (2020, the 4th year). The remaining $7-1=6$ years are divided into two equal halves of $6/2 = 3$ years each.

  • First Half: Years 2017, 2018, 2019 (Values: 50, 55, 65)
  • Second Half: Years 2021, 2022, 2023 (Values: 70, 75, 80)

2. Calculate Semi-Averages:

  • Average of the first half: $\bar{Y}_1 = \frac{50 + 55 + 65}{3} = \frac{170}{3} \approx 56.67$.
  • Average of the second half: $\bar{Y}_2 = \frac{70 + 75 + 80}{3} = \frac{225}{3} = 75$.

3. Determine Time Points for Plotting:

  • Center of the first half (2017, 2018, 2019) is the middle year: $t_1 = 2018$.
  • Center of the second half (2021, 2022, 2023) is the middle year: $t_2 = 2022$.

We have two points for the trend line: (2018, 56.67) and (2022, 75).

4. Plot the Trend Line: Plot these two points and draw a straight line passing through them. This line represents the estimated linear trend for the series using the method of semi-averages.

*(Figure: the original 7 data points and a straight trend line drawn through (2018, 56.67) and (2022, 75). The point for 2020 might be noticeably off the line.)*
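Although this example stops at the plot, the trend equation follows from the same slope and intercept formulas used in Example 1. As a sketch, assume coded time with $t=1$ for 2017 (a coding chosen here for illustration, not stated in the example), so the two semi-average points become $(2, 170/3)$ and $(6, 75)$:

$$b = \frac{75 - \frac{170}{3}}{6 - 2} = \frac{55/3}{4} = \frac{55}{12} \approx 4.583, \qquad a = \frac{170}{3} - \frac{55}{12} \times 2 = 47.5$$

$$T_t = 47.5 + 4.583\, t$$

For the omitted middle year 2020 ($t=4$), this gives $T_4 \approx 47.5 + 18.33 = 65.83$, compared with the observed value of 60, which is consistent with the 2020 point lying below the line.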

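Advantages

  • Relatively simple and easy to calculate.
  • Objective; it yields a unique trend line for a given dataset.
  • Provides a linear equation for the trend.

Disadvantages

  • Strictly assumes a linear trend; not suitable for non-linear patterns.
  • Highly affected by extreme values in the first or second half.
  • Uses only two summary points (the semi-averages) to determine the line, ignoring variation within the halves.
  • Ignores the middle observation when $n$ is odd.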

Due to its assumption of linearity and sensitivity to extreme values, the Method of Semi-Averages is a relatively crude method and is not recommended for complex time series or when a high degree of accuracy is required. It is more useful as an illustrative method or for very simple trend analysis.


Summary for Competitive Exams - Method of Semi-Averages

Method of Semi-Averages: Simple, objective method to fit a linear trend line.

Procedure:

  1. Divide data into two equal (or near equal, omitting middle if odd) halves.
  2. Calculate mean ($ \bar{Y} $) for each half.
  3. Plot means at the time midpoint of their respective halves ($t_1, \bar{Y}_1$) and ($t_2, \bar{Y}_2$).
  4. Draw a straight line through these two points.
  5. Equation: $T_t = a + bt$, where $b = (\bar{Y}_2 - \bar{Y}_1) / (t_2 - t_1)$.

Advantages: Objective, simple calculation, provides a linear equation.

Disadvantages: **Assumes linear trend only**, sensitive to outliers, uses limited information (only two summary points), ignores middle point if $n$ is odd.

Suitable only for simple cases or illustration.



Moving Average Method (Calculation and Smoothing)

Concept

The **Moving Average Method** is a widely used technique in time series analysis primarily for **smoothing** the data. Its main purpose is to eliminate or significantly reduce the effect of short-term fluctuations, such as seasonal variations ($S_t$) and irregular variations ($I_t$), to reveal the underlying longer-term pattern, which is typically the combination of the trend ($T_t$) and the cyclical component ($C_t$).

The method works by calculating a sequence of arithmetic means (averages) of the data values over a fixed-size "window" of consecutive periods. This window moves forward one period at a time, generating a new average for each position of the window. The term "moving" highlights that the set of data points included in the average changes as the window shifts.

The sequence of moving averages obtained is considered an estimate of the **Trend-Cycle component** of the time series, which could be additive ($T_t + C_t$) or multiplicative ($T_t \times C_t$).

Procedure

The calculation of moving averages involves the following steps:

  1. **Choose the Period ($k$):** Select the length of the moving average window, denoted by $k$. The choice of $k$ is critical. To effectively smooth out seasonality, the period $k$ should ideally be equal to the length of the seasonal cycle (e.g., $k=12$ for monthly data with yearly seasonality, $k=4$ for quarterly data with yearly seasonality, $k=7$ for daily data with weekly seasonality). Using a period that corresponds to the cycle length ensures that each moving average includes exactly one full set of seasonal observations, thus averaging out the seasonal effect.
  2. **Calculate Moving Totals:** Sum the values of the first $k$ observations ($Y_1, Y_2, \dots, Y_k$). This is the first moving total. Then, shift the window one period forward, drop the first value ($Y_1$), and add the $(k+1)$-th value ($Y_{k+1}$) to the sum of the previous $k-1$ values ($Y_2, \dots, Y_k$) to get the second moving total ($\sum_{i=2}^{k+1} Y_i$). Continue this process throughout the series. The moving total for a window starting at time $t$ is $\sum_{i=t}^{t+k-1} Y_i$.
  3. **Calculate Moving Averages:** Divide each moving total by the period $k$. The moving average for a window starting at time $t$ is $\frac{1}{k} \sum_{i=t}^{t+k-1} Y_i$.
  4. **Center the Average (Crucial for Even Periods):** The moving average calculated in the previous step is located at the time point that represents the center of the $k$ periods included in the calculation.
    • If the period $k$ is **odd** (e.g., 3-year, 5-year moving average), the center of the window falls precisely on a data point. For a 3-period moving average of $Y_1, Y_2, Y_3$, the average is centered at period 2. For a 5-period average of $Y_1, \dots, Y_5$, it's centered at period 3. In general, for odd $k$, the moving average of $Y_t, \dots, Y_{t+k-1}$ is centered at time $t + (k-1)/2$.
    • If the period $k$ is **even** (e.g., 4-quarterly, 12-monthly moving average), the center of the window falls exactly *between* two consecutive time points. For a 4-period average of $Y_1, Y_2, Y_3, Y_4$, the average is centered between period 2 and period 3 (at time 2.5), so it is not aligned with any original data point. To align the moving average with the original time points, a further step is required: calculate a **2-period moving average of the initial moving averages**. This process is called "centering the moving average". Two consecutive $k$-period averages centered at times $t - 0.5$ and $t + 0.5$ average to a value centered at time $t$, which is an original time point. For example, the 4-quarterly averages centered at times 2.5 and 3.5 combine into a centered value at period 3, the third quarter. The result is often referred to as a $k \times 2$ moving average (e.g., $4 \times 2$ moving average for quarterly data; also written $2 \times 4$).

The resulting series of (centered) moving averages provides a smoothed representation of the original time series, which is taken as the estimate of the Trend-Cycle component.
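Written out, centering an even-period average amounts to a single weighted average of the original values. As a sketch for $k=4$ (the symbol $M_t$ for the centered moving average at period $t$ is introduced here only for illustration):

$$M_t = \frac{1}{2}\left[\frac{Y_{t-2} + Y_{t-1} + Y_t + Y_{t+1}}{4} + \frac{Y_{t-1} + Y_t + Y_{t+1} + Y_{t+2}}{4}\right] = \frac{Y_{t-2} + 2Y_{t-1} + 2Y_t + 2Y_{t+1} + Y_{t+2}}{8}$$

The two end values receive half weight. With quarterly data, $Y_{t-2}$ and $Y_{t+2}$ fall in the same quarter of different years, so each of the four quarters still carries a total weight of $1/4$, which is why the seasonal effect averages out.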


Example 1. Calculate the 3-yearly moving averages for the following data:

| Year | Value ($Y_t$) |
|---|---|
| 2017 | 10 |
| 2018 | 12 |
| 2019 | 11 |
| 2020 | 14 |
| 2021 | 15 |
| 2022 | 13 |
| 2023 | 16 |

Answer:

Given: Yearly time series data.

To Find: 3-yearly moving averages.

Solution:

The period of the moving average is $k=3$, which is odd. The 3-year moving average will be centered at the middle year of the 3-year window.

| Year ($t$) | Value ($Y_t$) | 3-Year Moving Total (centered at middle year) | 3-Year Moving Average (centered at middle year) |
|---|---|---|---|
| 2017 | 10 | - | - |
| 2018 | 12 | $10+12+11 = 33$ | $33 / 3 = 11.00$ |
| 2019 | 11 | $12+11+14 = 37$ | $37 / 3 \approx 12.33$ |
| 2020 | 14 | $11+14+15 = 40$ | $40 / 3 \approx 13.33$ |
| 2021 | 15 | $14+15+13 = 42$ | $42 / 3 = 14.00$ |
| 2022 | 13 | $15+13+16 = 44$ | $44 / 3 \approx 14.67$ |
| 2023 | 16 | - | - |

The 3-yearly moving averages are calculated for the years 2018 through 2022. These values represent the estimated trend component for those respective years, having smoothed out shorter-term fluctuations.

Note that the moving average is calculated for $n-k+1 = 7-3+1 = 5$ periods (2018 to 2022). Data points are lost at the beginning and end.
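The moving-total and moving-average steps of this example can be sketched in Python as follows. The function name `moving_average` is hypothetical, and the sketch covers only the uncentered calculation, which is sufficient when $k$ is odd.

```python
def moving_average(values, k):
    """Return the simple k-period moving averages (n - k + 1 results)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    total = sum(values[:k])                  # first moving total
    averages = [total / k]
    for i in range(k, len(values)):
        total += values[i] - values[i - k]   # slide the window: add newest, drop oldest
        averages.append(total / k)
    return averages

years = [2017, 2018, 2019, 2020, 2021, 2022, 2023]
values = [10, 12, 11, 14, 15, 13, 16]
k = 3

ma = moving_average(values, k)
offset = (k - 1) // 2   # for odd k, the first average sits at the middle of the first window
for year, m in zip(years[offset:], ma):
    print(year, round(m, 2))
```

Updating a running total, rather than re-summing each window, mirrors the "drop the first value, add the next" description in the procedure.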


Example 2. Calculate the 4-quarterly centered moving averages for the following quarterly data:

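| Year | Quarter | Value ($Y_t$) |
|---|---|---|
| 2022 | Q1 | 30 |
| 2022 | Q2 | 40 |
| 2022 | Q3 | 35 |
| 2022 | Q4 | 50 |
| 2023 | Q1 | 35 |
| 2023 | Q2 | 45 |
| 2023 | Q3 | 40 |
| 2023 | Q4 | 55 |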

Answer:

Given: Quarterly time series data.

To Find: 4-quarterly centered moving averages.

Solution:

The period of the moving average is $k=4$, which is even. We need to calculate a $4$-period moving average and then center it using a $2$-period moving average of the $4$-period averages.

Number the quarters 1 to 8: 2022 Q1 (1), Q2 (2), Q3 (3), Q4 (4), 2023 Q1 (5), Q2 (6), Q3 (7), Q4 (8).

Step 1. Calculate the 4-quarter moving totals and averages. Each of these is centered between two quarters:

| Window (periods) | 4-Quarter Moving Total | 4-Quarter Moving Average | Centered at |
|---|---|---|---|
| 1 to 4 | $30+40+35+50 = 155$ | $155/4 = 38.75$ | 2.5 (between 2022 Q2 and Q3) |
| 2 to 5 | $40+35+50+35 = 160$ | $160/4 = 40.00$ | 3.5 (between 2022 Q3 and Q4) |
| 3 to 6 | $35+50+35+45 = 165$ | $165/4 = 41.25$ | 4.5 (between 2022 Q4 and 2023 Q1) |
| 4 to 7 | $50+35+45+40 = 170$ | $170/4 = 42.50$ | 5.5 (between 2023 Q1 and Q2) |
| 5 to 8 | $35+45+40+55 = 175$ | $175/4 = 43.75$ | 6.5 (between 2023 Q2 and Q3) |

Step 2. Center the averages by taking a 2-period moving average of consecutive 4-quarter averages. The average of the values centered at 2.5 and 3.5 is centered at $(2.5+3.5)/2 = 3$, which is 2022 Q3, so the first centered moving average belongs to 2022 Q3. The remaining values follow in the same way:

| Year | Quarter | Value ($Y_t$) | 2-Period Moving Total of 4-Q M.A. | 4-Quarter Centered M.A. (centered at quarter) |
|---|---|---|---|---|
| 2022 | Q1 | 30 | - | - |
| 2022 | Q2 | 40 | - | - |
| 2022 | Q3 | 35 | $38.75+40.00 = 78.75$ | $78.75 / 2 = 39.375$ |
| 2022 | Q4 | 50 | $40.00+41.25 = 81.25$ | $81.25 / 2 = 40.625$ |
| 2023 | Q1 | 35 | $41.25+42.50 = 83.75$ | $83.75 / 2 = 41.875$ |
| 2023 | Q2 | 45 | $42.50+43.75 = 86.25$ | $86.25 / 2 = 43.125$ |
| 2023 | Q3 | 40 | - | - |
| 2023 | Q4 | 55 | - | - |

The centered moving averages for this data are approximately 39.38 (for 2022 Q3), 40.63 (for 2022 Q4), 41.88 (for 2023 Q1), and 43.13 (for 2023 Q2). They provide a smoothed estimate of the trend-cycle component, aligned with the quarters from 2022 Q3 to 2023 Q2.

The moving average method always loses data points at the beginning and end of the series:

  • For odd $k$, $(k-1)/2$ points are lost at each end, for a total of $k-1$ points, leaving $n-k+1$ moving averages.
  • For even $k$ with centering, $k/2$ points are lost at each end, for a total of $k$ points, leaving $n-k$ centered moving averages.

In Example 1 ($k=3$), we lost $(3-1)/2 = 1$ point at the start (2017) and 1 at the end (2023), a total of $2 = k-1$ points.

In Example 2 ($k=4$), we lost $4/2 = 2$ points at the start (2022 Q1, Q2) and $4/2 = 2$ points at the end (2023 Q3, Q4), a total of $4 = k$ points, leaving $n-k = 8-4 = 4$ centered averages.
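The centering step for an even period can also be checked programmatically. The following is a minimal sketch; the function name `centered_moving_average` is hypothetical, and it handles even $k$ only.

```python
def centered_moving_average(values, k):
    """Centered moving averages for an even period k:
    a 2-period moving average of the k-period moving averages."""
    if k < 2 or k % 2 != 0:
        raise ValueError("this sketch handles even k only")
    if len(values) <= k:
        raise ValueError("need more than k observations")
    # k-period averages; entry i covers values[i : i + k] and sits between two periods
    ma = [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]
    # averaging consecutive pairs aligns result j with original index j + k // 2
    return [(ma[j] + ma[j + 1]) / 2 for j in range(len(ma) - 1)]

labels = ["2022 Q1", "2022 Q2", "2022 Q3", "2022 Q4",
          "2023 Q1", "2023 Q2", "2023 Q3", "2023 Q4"]
values = [30, 40, 35, 50, 35, 45, 40, 55]
k = 4

cma = centered_moving_average(values, k)
for label, m in zip(labels[k // 2:], cma):
    print(label, m)
```

The first result is paired with `labels[k // 2]`, that is 2022 Q3, and the list has $n-k = 4$ entries, matching the count of lost points discussed above.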

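Advantages

  • Conceptually simple and easy to understand.
  • Effective at smoothing out seasonal and irregular variations.
  • Flexible; it can follow the general contour of complex or non-linear trends without assuming a specific mathematical form.
  • Useful as a step in classical decomposition.

Disadvantages

  • Loses data points at the beginning and end of the series ($k-1$ in total for odd $k$, $k$ in total for even $k$ with centering).
  • Does not yield a mathematical equation for the trend, which hinders extrapolation.
  • The choice of the period $k$ is crucial; an inappropriate period can distort the trend.
  • Sensitive to outliers, and may not fully remove cyclical components.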

Despite its limitations, the Moving Average method is a powerful visual and preliminary analytical tool and a fundamental step in classical time series decomposition. It is valued for its ability to reveal underlying patterns obscured by short-term noise.


Summary for Competitive Exams - Moving Average Method

Moving Average Method: Smoothing technique to estimate Trend-Cycle ($T \times C$ or $T+C$).

Procedure:

  1. Choose period $k$ (ideally seasonal length).
  2. Calculate moving totals over windows of size $k$.
  3. Calculate moving averages by dividing totals by $k$.
  4. Centering (if $k$ is even): Calculate a 2-period moving average of the initial moving averages.

Advantages: Simple, effective smoothing (especially of seasonality), flexible (follows non-linear trends), useful for decomposition.

Disadvantages: **Loses data points** at ends (total $k-1$ for odd $k$, $k$ for even $k$ centered); no mathematical equation for trend (hinders extrapolation); choice of $k$ is crucial; sensitive to outliers.

Provides a smoothed series, not an explicit trend line/curve equation.



Method of Least Squares (Fitting a Straight Line Trend)

Concept

The **Method of Least Squares** is a widely used mathematical and statistical technique for finding the "best fitting" straight line or curve to a set of data points. In the context of time series analysis, it is employed to fit a mathematical function that represents the long-term trend component ($T_t$). The most basic application is fitting a straight line trend, assuming that the trend component can be approximated by a linear relationship with time.

A straight line can be represented by the equation:

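$$\mathbf{Y_c = a + bX} \tag{I}$$

Where:

  • $Y_c$ is the computed (estimated) trend value at time $X$.
  • $X$ is the time variable (the actual period or a coded value).
  • $a$ is the intercept, i.e., the trend value when $X = 0$.
  • $b$ is the slope, i.e., the change in the trend value per unit of time.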

The Method of Least Squares determines the values of the coefficients ($a$ and $b$) such that the sum of the squares of the vertical distances (or errors, or residuals) between the actual observed values ($Y$) and the values predicted by the trend line ($Y_c$) is minimized. That is, we want to find $a$ and $b$ that minimize $\sum (Y - Y_c)^2$ over all data points. Substituting $Y_c = a + bX$, the objective is to minimize:

$$\sum (Y - (a + bX))^2$$

Normal Equations

Using principles of calculus (specifically, finding the minimum by taking partial derivatives with respect to $a$ and $b$ and setting them equal to zero), the values of $a$ and $b$ that minimize the sum of squared errors can be found by solving the following system of two linear equations, known as the **normal equations**:

$$\sum Y = na + b\sum X$$

... (II)

$$\sum XY = a\sum X + b\sum X^2$$

... (III)

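Where:

  • $n$ is the number of observations (time periods).
  • $\sum X$ and $\sum Y$ are the sums of the time values and of the observed values.
  • $\sum XY$ is the sum of the products of each time value and its observed value.
  • $\sum X^2$ is the sum of the squared time values.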

To find $a$ and $b$, you would typically calculate the sums $\sum Y, \sum X, \sum XY, \sum X^2$ from your data and then solve the two simultaneous equations (II and III).
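As an illustration, the two normal equations can be solved directly for any choice of $X$ by elimination. The sketch below assumes plain Python lists; the function name `fit_line` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for a (intercept) and b (slope)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a between equations (II) and (III) gives b.
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom
    # Equation (II) rearranged: a = (sum Y - b * sum X) / n.
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data lying exactly on the line Y = 1 + 2X.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))  # (1.0, 2.0)
```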

Simplification using Coded Time

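Solving simultaneous equations by hand can be tedious. The calculations for $a$ and $b$ are significantly simplified if the time variable $X$ is coded so that its sum is zero, i.e., $\sum X = 0$. This is achieved by setting the origin (where $X=0$) at the center of the time series:

  • Odd number of periods: the middle period is the origin, and $X$ takes the values $\dots, -2, -1, 0, 1, 2, \dots$ (unit of $X$: 1 period).
  • Even number of periods: the origin lies midway between the two middle periods, and $X$ takes the values $\dots, -3, -1, 1, 3, \dots$ (unit of $X$: half a period).

When the time variable $X$ is coded such that $\sum X = 0$, the normal equations simplify:

Equation (II): $\sum Y = na + b(0) \implies \mathbf{a = \frac{\sum Y}{n}}$

Equation (III): $\sum XY = a(0) + b\sum X^2 \implies \mathbf{b = \frac{\sum XY}{\sum X^2}}$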

These simplified formulas make calculating $a$ and $b$ much easier, especially when using coded time originating at the series' center.
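A small helper can generate such codes. This is a sketch under one common convention, taken here as an assumption: steps of 1 around the middle period when the number of periods is odd, and steps of 2 around the midpoint of the two middle periods when it is even. The name `coded_time` is hypothetical.

```python
def coded_time(n):
    """Return symmetric time codes for n periods; the codes sum to zero."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n % 2 == 1:
        # Odd n: the middle period is the origin, steps of 1.
        mid = n // 2
        return [i - mid for i in range(n)]
    # Even n: the origin falls between the two middle periods, steps of 2.
    return [2 * i - (n - 1) for i in range(n)]


print(coded_time(5))  # [-2, -1, 0, 1, 2]
print(coded_time(6))  # [-5, -3, -1, 1, 3, 5]
```

With the even-$n$ codes, one unit of $X$ corresponds to half a period, so the slope $b$ is the change per half period.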


Example 1. Fit a straight line trend by the method of least squares to the following data and estimate the sales value for 2025.

| Year | Sales ($\textsf{₹}$ Lakhs) ($Y$) |
| --- | --- |
| 2019 | 70 |
| 2020 | 75 |
| 2021 | 80 |
| 2022 | 85 |
| 2023 | 90 |

Answer:

Given: Yearly sales data for 5 years.

To Find: Straight line trend equation using Least Squares and forecast for 2025.

Solution:

The number of years is $n=5$, which is odd. We will use coded time $X$ with the origin (where $X=0$) at the middle year, 2021. The equation of the trend line is $Y_c = a + bX$.

1. Set up the calculation table and code time (X):

| Year | Sales ($\textsf{₹}$ Lakhs) ($Y$) | Time Code $X$ (Origin at 2021) | $XY$ | $X^2$ |
| --- | --- | --- | --- | --- |
| 2019 | 70 | -2 | $(-2) \times 70 = -140$ | $(-2)^2 = 4$ |
| 2020 | 75 | -1 | $(-1) \times 75 = -75$ | $(-1)^2 = 1$ |
| 2021 | 80 | 0 | $0 \times 80 = 0$ | $0^2 = 0$ |
| 2022 | 85 | 1 | $1 \times 85 = 85$ | $1^2 = 1$ |
| 2023 | 90 | 2 | $2 \times 90 = 180$ | $2^2 = 4$ |
| Total ($\sum$) | $\sum Y=400$ | $\sum X=0$ | $\sum XY = -140 - 75 + 0 + 85 + 180 = 50$ | $\sum X^2 = 4 + 1 + 0 + 1 + 4 = 10$ |

Number of observations $n=5$. We observe that $\sum X = 0$, which simplifies the formulas for $a$ and $b$.

2. Calculate $a$ and $b$ using simplified formulas:

$$a = \frac{\sum Y}{n} = \frac{400}{5} = 80$$

[From simplified normal equation]

$$b = \frac{\sum XY}{\sum X^2} = \frac{50}{10} = 5$$

[From simplified normal equation]

3. Write the Trend Equation:

Substitute the calculated values of $a$ and $b$ into the trend equation $Y_c = a + bX$ (Equation i).

$$Y_c = 80 + 5X$$

(Origin: Year 2021; Unit of $X$: 1 Year)

This equation allows us to calculate the estimated trend value for any given year $X$ relative to the origin 2021.

4. Estimate Sales for 2025:

To estimate the sales for the year 2025, we need to find the corresponding value of the coded time variable $X$.

The origin $X=0$ is at year 2021. The year 2025 is $2025 - 2021 = 4$ years away from the origin in the positive direction.

So, for the year 2025, $X = 4$.

Substitute $X=4$ into the trend equation:

$$Y_c (2025) = 80 + 5 \times 4$$

(Substituting $X=4$ into trend equation)

$$Y_c (2025) = 80 + 20$$

$$Y_c (2025) = 100$$

The estimated sales for the year 2025, based on this linear trend model, are $\textsf{₹}100$ Lakhs.
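The arithmetic of this example can be checked with a few lines of Python. This is only a verification sketch using the data from the table; the variable and function names are illustrative.

```python
years = [2019, 2020, 2021, 2022, 2023]
sales = [70, 75, 80, 85, 90]   # sales in lakhs of rupees, from the example
origin = 2021                  # middle year, so the coded times sum to zero

xs = [year - origin for year in years]   # -2, -1, 0, 1, 2

# Simplified formulas, valid because sum(xs) == 0.
a = sum(sales) / len(sales)
b = sum(x * y for x, y in zip(xs, sales)) / sum(x * x for x in xs)


def trend(year):
    """Trend value Y_c = a + b*X for a calendar year."""
    return a + b * (year - origin)


print(a, b, trend(2025))  # 80.0 5.0 100.0
```

Because the sample data lie exactly on a straight line, `trend(year)` reproduces every observed value, so the residuals are all zero in this particular example.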

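Advantages

  • It is objective: the same data always give the same trend line, free of personal judgment.
  • It provides a mathematical equation ($Y_c = a + bX$) for the trend.
  • The equation can be used for forecasting by extending it to future periods.
  • It is statistically sound, since it minimizes the sum of squared errors.
  • It can be adapted to non-linear forms, such as a parabola.

Disadvantages

  • It assumes a specific functional form (e.g., linear), which may not match the actual trend.
  • The calculations can be more complex than those of the simpler methods.
  • It is sensitive to outliers, because the errors are squared.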

Despite some limitations, the Method of Least Squares is generally considered the most statistically sound method for trend estimation when an appropriate functional form can be assumed. It is widely used in practice due to its objectivity, mathematical rigor, and usefulness for forecasting.


Summary for Competitive Exams - Method of Least Squares (Linear Trend)

Method of Least Squares: Objective statistical method to fit the "best" trend line/curve by minimizing $\sum (Y - Y_c)^2$.

Linear Trend Equation: $$Y_c = a + bX$$ (where $X$ is time variable).

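Normal Equations (to find $a$ and $b$):

  • $\sum Y = na + b\sum X$
  • $\sum XY = a\sum X + b\sum X^2$

Simplification using Coded Time ($\sum X = 0$):

If $\sum X = 0$:

  • $a = \frac{\sum Y}{n}$
  • $b = \frac{\sum XY}{\sum X^2}$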

Advantages: Objective, provides mathematical equation ($Y_c = a+bX$), allows forecasting, statistically sound, adaptable to non-linear forms.

Disadvantages: Assumes specific functional form (e.g., linear), calculations can be complex, sensitive to outliers.



Method of Least Squares (Fitting a Parabolic Trend)

Concept

While a straight line often serves as a reasonable approximation for the long-term trend, some time series exhibit a trend that is not linear. If the trend shows a consistent curvature (e.g., growth that steadily speeds up or slows down, or a product lifecycle that peaks and then declines), a more complex function is needed to capture it. A common mathematical form used to model such a non-linear trend, particularly one with a single bend (one turning point), is a **parabola**, which is a second-degree polynomial. A parabola has no point of inflection, so it cannot reproduce a full S-shaped pattern.

The equation for a parabolic trend is:

$$\mathbf{Y_c = a + bX + cX^2}$$

... (i)

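Where:

  • $Y_c$ is the computed (estimated) trend value.
  • $X$ is the time variable.
  • $a$ is the trend value at the origin ($X = 0$).
  • $b$ is the slope of the curve at the origin.
  • $c$ determines the curvature: the parabola opens upward if $c > 0$ and downward if $c < 0$.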

Similar to the linear case, the **Method of Least Squares** is used to find the values of $a$, $b$, and $c$ that make the fitted parabolic curve the "best fit" for the observed data points $Y$. This is achieved by minimizing the sum of the squares of the vertical differences (residuals) between the actual values ($Y$) and the trend values predicted by the parabolic equation ($Y_c$).

Minimize $\sum (Y - Y_c)^2 = \sum (Y - (a + bX + cX^2))^2$ over all data points.

Normal Equations

To find the values of $a$, $b$, and $c$ that minimize the sum of squared errors, we use calculus by taking partial derivatives of $\sum (Y - (a + bX + cX^2))^2$ with respect to $a$, $b$, and $c$ and setting these derivatives to zero. This process results in a system of three simultaneous linear equations, known as the **normal equations** for fitting a parabolic trend:

$$\sum Y = na + b\sum X + c\sum X^2$$

... (II)

$$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$$

... (III)

$$\sum X^2Y = a\sum X^2 + b\sum X^3 + c\sum X^4$$

... (IV)

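Where:

  • $n$ is the number of observations.
  • $\sum X$, $\sum X^2$, $\sum X^3$, $\sum X^4$ are the sums of the first four powers of the time values.
  • $\sum Y$, $\sum XY$, $\sum X^2Y$ are the sums of the observed values and of their products with $X$ and $X^2$.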

Simplification using Coded Time

Solving three simultaneous equations can be complex. However, just as in the linear case, the calculations are considerably simplified if the time variable $X$ is coded such that the origin ($X=0$) is set at the center of the time series. If $X$ is coded symmetrically around the middle period (using increments of 1 for odd $n$ or increments of 2 for even $n$, as described for linear trend), then every positive value of $X$ is matched by an equal negative value, so the sums of the odd powers vanish:

  • $\sum X = 0$
  • $\sum X^3 = 0$

If the time coding ensures $\sum X = 0$ and $\sum X^3 = 0$, the normal equations (II, III, IV) simplify as follows:

Equation (II): $\sum Y = na + b(0) + c\sum X^2 \implies \mathbf{\sum Y = na + c\sum X^2}$ ... (V)

Equation (III): $\sum XY = a(0) + b\sum X^2 + c(0) \implies \mathbf{\sum XY = b\sum X^2} \implies \mathbf{b = \frac{\sum XY}{\sum X^2}}$ ... (VI)

Equation (IV): $\sum X^2Y = a\sum X^2 + b(0) + c\sum X^4 \implies \mathbf{\sum X^2Y = a\sum X^2 + c\sum X^4}$ ... (VII)

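With this simplification:

  • $b$ is obtained directly from Equation (VI).
  • $a$ and $c$ are found by solving the two simultaneous equations (V) and (VII).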

This coded time approach significantly streamlines the process of fitting a parabolic trend using least squares.
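For reference, Equations (V) and (VII) can also be solved in closed form. Solving (V) for $a$ and substituting into (VII) gives the following (a sketch of the algebra, assuming $n\sum X^4 \neq (\sum X^2)^2$):

$$c = \frac{n\sum X^2Y - \sum X^2 \sum Y}{n\sum X^4 - (\sum X^2)^2}$$

$$a = \frac{\sum Y - c\sum X^2}{n}$$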


Example 1. Fit a parabolic trend ($Y_c = a + bX + cX^2$) by the method of least squares to the following data and estimate the value for 2026.

| Year | Production (units) ($Y$) |
| --- | --- |
| 2018 | 20 |
| 2019 | 25 |
| 2020 | 28 |
| 2021 | 30 |
| 2022 | 29 |
| 2023 | 27 |
| 2024 | 23 |

Answer:

Given: Yearly production data for 7 years.

To Find: Parabolic trend equation ($Y_c = a + bX + cX^2$) using Least Squares and forecast for 2026.

Solution:

The number of years is $n=7$, which is odd. We will use coded time $X$ with the origin (where $X=0$) at the middle year, 2021. The equation of the parabolic trend is $Y_c = a + bX + cX^2$.

1. Set up the calculation table and code time (X):

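| Year | Production ($Y$) | Time Code $X$ (Origin at 2021) | $X^2$ | $X^3$ | $X^4$ | $XY$ | $X^2Y$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2018 | 20 | -3 | 9 | -27 | 81 | -60 | 180 |
| 2019 | 25 | -2 | 4 | -8 | 16 | -50 | 100 |
| 2020 | 28 | -1 | 1 | -1 | 1 | -28 | 28 |
| 2021 | 30 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2022 | 29 | 1 | 1 | 1 | 1 | 29 | 29 |
| 2023 | 27 | 2 | 4 | 8 | 16 | 54 | 108 |
| 2024 | 23 | 3 | 9 | 27 | 81 | 69 | 207 |
| Total ($\sum$) | $\sum Y=182$ | $\sum X=0$ | $\sum X^2=28$ | $\sum X^3=0$ | $\sum X^4=196$ | $\sum XY=14$ | $\sum X^2Y=652$ |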

Number of observations $n=7$. We observe that $\sum X = 0$ and $\sum X^3 = 0$, which simplifies the formulas for $a, b, c$.

2. Calculate $a$, $b$, and $c$ using simplified normal equations:

From Equation (VI), $b = \frac{\sum XY}{\sum X^2}$:

$$b = \frac{14}{28} = 0.5$$

[From Eq. (VI)]

From Equations (V) and (VII), we have a system for $a$ and $c$:

$$\sum Y = na + c\sum X^2 \implies 182 = 7a + 28c$$

$$\sum X^2Y = a\sum X^2 + c\sum X^4 \implies 652 = 28a + 196c$$

We can solve this system. Divide the first equation by 7 and the second by 28 to simplify:

Equation (V'): $182/7 = a + (28/7)c \implies 26 = a + 4c$

Equation (VII'): $652/28 = a + (196/28)c \implies 23.2857 \approx a + 7c$

Now solve the system:

$(a + 7c) - (a + 4c) \approx 23.2857 - 26$

$3c \approx -2.7143$

$$c \approx \frac{-2.7143}{3} \approx -0.9048$$

Substitute the unrounded value $c \approx -0.904762$ into Equation (V'): $26 = a + 4(-0.904762)$

$26 = a - 3.6190$

$$a = 26 + 3.6190 \approx 29.6190$$

So, $a \approx 29.6190$, $b = 0.5$, and $c \approx -0.9048$.

3. Write the Trend Equation:

Substitute the calculated values of $a$, $b$, and $c$ into the parabolic trend equation $Y_c = a + bX + cX^2$ (Equation i).

$$Y_c \approx 29.6190 + 0.5X - 0.9048X^2$$

(Origin: Year 2021; Unit of $X$: 1 Year)

This equation represents the estimated parabolic trend.

4. Estimate Production for 2026:

To estimate the production for the year 2026, we need to find the corresponding value of the coded time variable $X$.

The origin $X=0$ is at year 2021. The year 2026 is $2026 - 2021 = 5$ years away from the origin in the positive direction.

So, for the year 2026, $X = 5$.

Substitute $X=5$ into the trend equation:

$$Y_c (2026) \approx 29.6190 + 0.5(5) - 0.9048(5^2)$$

(Substituting $X=5$)

$$Y_c (2026) \approx 29.6190 + 2.5 - 0.9048(25)$$

$$Y_c (2026) \approx 29.6190 + 2.5 - 22.62$$

$$Y_c (2026) \approx 32.1190 - 22.62$$

$$Y_c (2026) \approx 9.50$$

The estimated production for the year 2026, based on this parabolic trend model, is approximately $9.50$ units.

Since $c \approx -0.9048$ is negative, the parabola opens downwards, consistent with the production values rising and then falling towards the end of the observed period (2021 peak).
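The fit can be reproduced in Python as a check on the hand calculation. This is a sketch using the data from the table; the names are illustrative, and the expressions for $c$ and $a$ come from eliminating $a$ between Equations (V) and (VII).

```python
years = [2018, 2019, 2020, 2021, 2022, 2023, 2024]
production = [20, 25, 28, 30, 29, 27, 23]   # units, from the example
origin = 2021                               # middle year of the series

xs = [year - origin for year in years]      # -3 .. 3, symmetric about 0
n = len(xs)

sum_y = sum(production)
sum_x2 = sum(x ** 2 for x in xs)
sum_x4 = sum(x ** 4 for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, production))
sum_x2y = sum(x ** 2 * y for x, y in zip(xs, production))

# Equation (VI): b follows directly because sum X = sum X^3 = 0.
b = sum_xy / sum_x2
# Equations (V) and (VII) solved for c, then a (elimination of a).
c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
a = (sum_y - c * sum_x2) / n


def trend(year):
    """Trend value Y_c = a + b*X + c*X^2 for a calendar year."""
    x = year - origin
    return a + b * x + c * x * x


print(round(a, 4), round(b, 4), round(c, 4), round(trend(2026), 2))
```

The coefficients agree with the worked example to three decimal places, and the 2026 estimate comes out at about 9.5 units. Evaluating `trend` a few more years ahead shows how quickly the $X^2$ term pulls the forecast down, which is why extrapolating a parabolic trend far beyond the data needs caution.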

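Advantages

  • It captures a trend with a single bend, which a straight line cannot represent.
  • Like the linear fit, it is objective and yields an explicit equation ($Y_c = a + bX + cX^2$) that can be used for forecasting.

Disadvantages

  • The calculations are more involved, since three coefficients must be estimated.
  • Extrapolation is risky: beyond the observed period the $X^2$ term dominates, so forecasts can rise or fall very rapidly.
  • It assumes a second-degree form, which may not match the actual trend, and like any least squares fit it is sensitive to outliers.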

Fitting a parabolic trend using least squares is suitable when the time series data visibly exhibits a non-linear curvature that a second-degree polynomial can approximate. It offers a more flexible trend model than a straight line but requires careful consideration of the data's pattern and the potential pitfalls of extrapolation.


Summary for Competitive Exams - Methods of Measuring Trend

Methods for Measuring Trend ($T_t$):

  • Freehand Curve: Simple, visual, **subjective**. No equation. (Least rigorous)
  • Semi-Averages: Objective linear trend. Simple, but **assumes linearity only** and sensitive to outliers.
  • Moving Average: **Smooths data**, estimates Trend-Cycle. Requires period $k$. **Loses end data**, no explicit equation. (Good for decomposition)
  • Least Squares: **Objective**, fits mathematical curve by minimizing $\sum(Y-Y_c)^2$. Gives equation, allows forecasting.
    • Linear Trend ($Y_c = a+bX$): Fits a straight line. Normal equations: $\sum Y = na + b\sum X$, $\sum XY = a\sum X + b\sum X^2$. Simplified if $\sum X=0$.
    • Parabolic Trend ($Y_c = a+bX+cX^2$): Fits a curve with one bend. Normal equations: $\sum Y = na + b\sum X + c\sum X^2$, $\sum XY = a\sum X + b\sum X^2 + c\sum X^3$, $\sum X^2Y = a\sum X^2 + b\sum X^3 + c\sum X^4$. Simplified if $\sum X=0, \sum X^3=0$.

Time Coding ($X$): Symmetric coding around the center ($\sum X=0$, $\sum X^{odd}=0$) greatly simplifies Least Squares calculations for polynomial trends.

Least Squares is generally preferred for analytical rigor and forecasting when the trend form is appropriate.